Abstract


The rapid growth of data base management systems in
recent years has resulted in a shortage of personnel trained
in data base concepts. What is needed at the present time
is a flexible, inexpensive, and accessible tool for teaching
data base concepts. This paper proposes such a tool
(Educational Data Base System) consisting of a data base
management system embedded in the host language APL.

The system supports two logical data organizations -
hierarchical and relational - imposed on the same storage
structure. Software tables map the logical data structure
onto the physical storage structure and provide the ability
to access and manipulate the same logical data base using
either hierarchic or relational data base commands.


1. Introduction


The rapid growth in data base management systems has
created a shortage of, and therefore a need for, personnel
trained in data base concepts. Thus a problem facing all
users or potential users of data base management systems, in
successfully implementing and utilizing these systems, is
the education of personnel so that they can make full and
effective use of the capabilities of these systems.
Although the systems may vary significantly in design, the
functions performed, i.e., definition, creation, retrieval,
storage, modification, and manipulation of the data base,
are similar in all systems. Therefore, what is needed is a
tool that will allow the teaching of these concepts, and
provide 'hands-on' experience in using a data base
management system. Such a tool should be:

(a) Educational; it should provide the facilities for
teaching different data base concepts.

(b) Accessible; it should be available from a
conversational terminal.

(c) Flexible; it should be easy to use both in
manipulating data as well as in defining and
entering it into the data base.

(d) Inexpensive; to be useful for teaching, it must be
inexpensive to use for student assignments. Some
of the existing data base management systems are
expensive to operate in such an environment.

(e) Realistic; it should reflect the facilities
available in data base management systems currently
in use.

We propose an Educational Data Base System (EDBS)
embedded in the host language APL which can be used as a
tool for teaching data base concepts. It supports two
logical data organizations, hierarchical (based on VAhDL-1
[13] and IMS [8]) and relational (based on Codd's relational
model of data [3]), and provides all the facilities
necessary to define, create, retrieve, store, modify, and
manipulate data in a data base.

Hierarchies are supported because this logical data
organization is in wide use in many systems today (IMS,
TDMS), and relations are supported because they are a 'nice'
way of describing and discussing data [14]. Certain data is
also represented nicely by hierarchies, e.g., a company's
organizational structure, while other data is better
represented by relations, e.g., accounting data. EDBS will
allow a user to access the same data base either
hierarchically or relationally, or will allow independent
hierarchic and relational data bases.

APL was chosen as the host language because its
implementation provides a stable terminal support system.
In addition, the APL language is well defined [6] and the
implementation we use (APL/360) seems reliable. APL is also
a very non-procedural language which should make its
assimilation by first time users easier. In our
environment, APL is provided as the APL+ system [12] which,
in addition to the facilities provided by APL, provides for
the creation, accessing, modification, and manipulation of
files on secondary storage, i.e., a file system.

EDBS is not intended as a commercially competitive data
base management system. It is a tool for teaching data base
concepts. Data bases in the system will be small, on the
order of thousands of entries rather than hundreds of
thousands as in commercial systems. Because of this, the
system does not require any great optimisation strategies or
particularly efficient allocation of storage. In fact, in
the design and implementation of the system, when a choice
had to be made between a highly complex design or algorithm
that might be difficult to implement and a simpler, perhaps
less efficient, design or algorithm, the simpler method was
almost invariably chosen. This is compatible with our
objective of getting a working prototype system in as short
a time as possible.

In  view  of  the  purpose  of  our  system  some  requirements 
are  emphasised.  In  particular,  the  system  should  provide: 

(1) Easy access to the system.

(2) Easy access to the data.

(3) An easy to use facility for describing data to the
system.

(4) A simple mechanism for initially loading data
bases.

(5) Reasonable performance characteristics for both the
hierarchical and relational subsystems.

2. Organization of the System

An outline of the system showing its functional parts is
given in fig. 2-1. Heavily used data paths are indicated by
the symbol <=>.

Fig. 2-1. EDBS structure.


APL sign-on procedures are used to identify the user and
to gain access to the computer. Once signed-on, the user
loads a copy of EDBS (either hierarchical or relational
subsystem) into his APL workspace. This puts him into
User-mode and sets up an internal control block for this user.

After the above initialization, the user is given
control of the terminal which is now ready to accept APL
commands or calls to either the hierarchic or relational DML
(Data Manipulation Language) Interpreter. The results of
the call (if any) are returned to the user in a user
specified output file. Status information on the result of
any call is always returned to the user (see Sec. 3).

Specially designated users are allowed to access the
system in DBA (Data Base Administrator) mode. This is a
special control mode which allows the user to define and
delete data definitions and to inspect logged data. Using
his data definition capabilities, the Data Base
Administrator also ensures the uniqueness of data base names
in the system.

The storage structure of EDBS is confined by the APL+
file system. Files in APL+ are a collection of APL data,
organized into components. A component of an APL+ file may
be any APL value, any scalar, vector, matrix or higher
dimensional array, holding either characters or numbers but
not both [12]. The physical storage of data is performed by
the APL+ file system. How the data in the file components
is actually organized on disk is irrelevant to EDBS since
the APL+ file system does all data accessing and
organization.

A hierarchical data base [8] as defined in EDBS is
composed of data base records. Data base records consist of
fixed size segments residing at hierarchic levels. The
highest level segment is referred to as the root segment.
Segments are further divided into fields. Fields are of
three types: integer; date; string. Each root segment has
one key field which uniquely identifies the root segment
within a hierarchical data base. Non-root segments contain
a sequence field which is unique within a given parent.

A relational data base [3] in EDBS is defined by a set
of relations, where a relation is a subset of the Cartesian
product S1 x S2 x ... x Sn (S1, S2, ..., Sn sets). Sj is
referred to as the jth domain of the relation. Like fields,
domains may be one of the data types integer, date, or
string. A candidate key of a relation is a combination of
one or more domains used to uniquely identify a tuple (an
instance of S1 x S2 x ... x Sn) of the relation.
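
The two definitions above can be illustrated with a minimal sketch in
Python (not the APL of EDBS itself). The domains S1, S2, S3 and the
relation EMP are hypothetical examples, and the helper names are
assumptions of this sketch rather than part of the system.

```python
from itertools import product

def is_relation(tuples, domains):
    """True if every tuple lies in the Cartesian product S1 x S2 x ... x Sn."""
    return set(tuples) <= set(product(*domains))

def is_candidate_key(tuples, positions):
    """True if the chosen domain positions uniquely identify every tuple."""
    projected = [tuple(t[p] for p in positions) for t in tuples]
    return len(set(projected)) == len(projected)

# Hypothetical domains: employee numbers, names, department numbers.
S1 = {1, 2, 3}
S2 = {"ADAMS", "BAKER"}
S3 = {10, 20}

# A relation is any subset of S1 x S2 x S3.
EMP = [(1, "ADAMS", 10), (2, "BAKER", 10), (3, "ADAMS", 20)]
```

Here the first domain alone is a candidate key of EMP, while the
second domain alone is not, since two tuples share the name ADAMS.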

The implementation of a logical data base (hierarchic or
relational) in EDBS consists of the user's data plus the
tables that allow him to access this data. Since the
representation of the logical structure of the data and the
actual physical data are stored separately in EDBS, some
mechanism, for example pointers, is necessary to establish
the hierarchic and relational structure of the physical
data.

Lowenthal in [10] defines a trace as a means of
specifying the structure of hierarchic data. A trace can be
thought of as an address associated with a segment in a
hierarchical data base which allows access to the data in
the segment and provides some information on the ancestry of
the segment. For a segment at level n in a hierarchy, a
trace consists of n+1 integers that uniquely identify that
segment in the data base.

Fig. 2-2. Sample hierarchic data base.


Consider as an example, a data base as in fig. 2-2,
where each segment type is identified by a unique number
(Type Code), e.g., the EMPLOYEE Type Code is 1, etc. To
simplify implementation, we restrict the maximum number of
occurrences of a root segment type and any of its son
segment types. This maximum number is specified at the time
the data base is created. Thus in fig. 2-2 there can be at
most ten occurrences of the segment type EMPLOYEE although
there may be fewer. Under any occurrence of an EMPLOYEE
segment there can be at most 5 occurrences of the JOBHISTORY
segment type, etc.

Let n1 take on the values 1 to 10, n2 the values 1 to 5,
n3 the values 1 to 2, and n4 the values 1 to 5. Note that
in each instance the variables n1, n2, n3, and n4 can take
on a value from 1 to the maximum number of occurrences of
one of the segment types in fig. 2-2.

A trace for an EMPLOYEE segment may have the values
1.n1., a JOBHISTORY segment 2.n1.n2., a SALARYHISTORY
segment 3.n1.n2.n3., and a CHILDREN segment 4.n1.n4. The
first number of the trace is the unique Type Code assigned
to each segment type. The remaining numbers in the trace
indicate the instance of a segment, at a given hierarchic
level, under a given parent. This latter sequence of
numbers is not unique, but given a unique Type Code, it does
uniquely identify a segment.
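
As an illustration, the trace rules above can be sketched in Python
(not the APL of EDBS). The parent of each segment type is inferred
here from the trace forms given in the text; the table layout and the
function names are assumptions of this sketch.

```python
# Type Code -> (name, parent Type Code or None, maximum occurrences per parent).
# The parent links are inferred from the trace forms 1.n1, 2.n1.n2,
# 3.n1.n2.n3 and 4.n1.n4 described in the text.
SEGMENT_TYPES = {
    1: ("EMPLOYEE", None, 10),
    2: ("JOBHISTORY", 1, 5),
    3: ("SALARYHISTORY", 2, 2),
    4: ("CHILDREN", 1, 5),
}

def bounds(type_code):
    """Maximum occurrence counts from the root down to the given segment type."""
    limits = []
    while type_code is not None:
        _, parent, maximum = SEGMENT_TYPES[type_code]
        limits.append(maximum)
        type_code = parent
    return limits[::-1]

def is_valid_trace(trace):
    """A trace is (Type Code, n1, n2, ...): n+1 integers for a level-n segment."""
    type_code, *occurrences = trace
    if type_code not in SEGMENT_TYPES:
        return False
    limits = bounds(type_code)
    return len(occurrences) == len(limits) and all(
        1 <= n <= limit for n, limit in zip(occurrences, limits))
```

For example, (3, 10, 5, 2) is a valid SALARYHISTORY trace, while
(3, 10, 5, 3) is not, because at most two SALARYHISTORY occurrences
are allowed under one JOBHISTORY segment.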

Traces in EDBS are not encoded directly for storage in
the data base. Rather, they are a mechanism for specifying
a certain position or 'slot' in an array. For each segment
type at level n, one needs an n-dimensional array, hereafter
referred to as a Position Matrix, to uniquely identify each
instance of a segment type. The Type Code of the traces
uniquely identifies a Position Matrix and the remaining
numbers specify a 'slot' in the Position Matrix. Each
'slot' in a Position Matrix contains an index into a Data
File. [...] hierarchical data base is referred to as an
Access Table.

We use an array to implement the traces because APL
manipulates arrays as conveniently as scalars. Also the
manner of accessing arrays in APL corresponds nicely to a
trace, i.e., we access an array, A, by A[n1;n2;n3;...] and
we specify a trace by n1.n2.n3. Other methods of
encoding traces are also possible [10].
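
The correspondence between a trace and an array subscript can be
sketched as follows, in Python rather than APL. This is only an
illustration: it assumes that the Access Table can be modeled as a
mapping from Type Code to Position Matrix, that a slot value of 0
means 'empty', and that the dimensions are those of the sample data
base; the function names are hypothetical.

```python
def make_position_matrix(dims, fill=0):
    """Build a nested list with the given dimensions; 0 marks an empty slot."""
    if len(dims) == 1:
        return [fill] * dims[0]
    return [make_position_matrix(dims[1:], fill) for _ in range(dims[0])]

# Assumed model of an Access Table: Type Code -> Position Matrix.
ACCESS_TABLE = {
    1: make_position_matrix([10]),        # EMPLOYEE
    2: make_position_matrix([10, 5]),     # JOBHISTORY
    3: make_position_matrix([10, 5, 2]),  # SALARYHISTORY
    4: make_position_matrix([10, 5]),     # CHILDREN
}

def set_slot(trace, data_index):
    """Store an index into the Data File in the slot named by the trace."""
    type_code, *ns = trace
    cell = ACCESS_TABLE[type_code]
    for n in ns[:-1]:
        cell = cell[n - 1]       # traces are 1-origin, Python lists 0-origin
    cell[ns[-1] - 1] = data_index

def get_slot(trace):
    """Return the Data File index stored in the slot named by the trace."""
    type_code, *ns = trace
    cell = ACCESS_TABLE[type_code]
    for n in ns:
        cell = cell[n - 1]
    return cell
```

The trace 3.2.1.2 thus selects matrix 3 and the element that APL would
address as A[2;1;2].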

Some additional tables are used to interpret the
hierarchical and relational commands. These tables comprise
the Data Base Definitions (DBD) of the system, and specify
the logical to physical mapping of data. Their complete
specification can be found in [9].

Relations on the hierarchic structure can also be
defined using the representation of traces by an Access
Table. Each segment of a hierarchical structure defines a
relation and thus, each 'slot' in the Access Table, along
with the key/sequence fields of hierarchically higher
segments, defines a relation [3]. We restrict the types of
relations in our system to the ones corresponding to a
hierarchical path in a natural way. This affects the
generality of our system, but it greatly simplifies [...]

All key fields/sequence fields/candidate key domains are
kept as inverted lists to aid in the searching of the data
base. Search on any field/domain in the data base is
possible, but if the field/domain is not inverted the search
will be slow. Since the data bases will be relatively
small, the inverted lists will not be very large and
therefore, should not be difficult to search. Most
interrogation is expected to be on key fields/sequence
fields/candidate key domains.

Inverted lists in EDBS are implemented by threaded
binary trees. Each node in the tree, besides containing
link information, contains an attribute value and the
trace(s) of the segment(s) having that attribute value. The
same inverted lists are used for hierarchic or relational
access to a data base.
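
A minimal sketch of such an inverted list, written in Python, is
given below. It uses a right-threaded binary search tree, which is
one common form of threaded tree; whether EDBS threads its trees in
exactly this way is not stated, so the layout and all names here are
assumptions.

```python
class Node:
    def __init__(self, value, trace):
        self.value = value
        self.traces = [trace]    # traces of all segments with this value
        self.left = None
        self.right = None        # right child, or in-order successor if is_thread
        self.is_thread = False

class InvertedList:
    def __init__(self):
        self.root = None

    def insert(self, value, trace):
        if self.root is None:
            self.root = Node(value, trace)
            return
        node = self.root
        while True:
            if value == node.value:
                node.traces.append(trace)
                return
            if value < node.value:
                if node.left is None:
                    new = Node(value, trace)
                    new.right, new.is_thread = node, True   # successor is the parent
                    node.left = new
                    return
                node = node.left
            else:
                if node.right is None or node.is_thread:
                    new = Node(value, trace)
                    # The new node inherits the old right link (thread or None).
                    new.right, new.is_thread = node.right, node.is_thread
                    node.right, node.is_thread = new, False
                    return
                node = node.right

    def search(self, value):
        node = self.root
        while node is not None:
            if value == node.value:
                return node.traces
            if value < node.value:
                node = node.left
            else:
                node = None if node.is_thread else node.right
        return []

    def in_order(self):
        """Visit values in ascending order by following threads, without a stack."""
        node = self.root
        while node is not None and node.left is not None:
            node = node.left
        while node is not None:
            yield node.value, node.traces
            if node.is_thread:
                node = node.right
            else:
                node = node.right
                while node is not None and node.left is not None:
                    node = node.left
```

The threads allow a sequential scan of the attribute values, as needed
for range qualifications, without recursion or an auxiliary stack.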

Data Files store the actual data instances, the Access
Table, and inverted lists for a data base. There is one
Data File for each data base. The Access Table and inverted
lists for a data base are kept in the same file as the data
so that when a data base has to be locked for modification
only this one file need be locked and access to other data
bases by other users is still possible. Data instances in a
Data File are in a random order, i.e., [...]

3. DML Interpreter


The DML (Data Manipulation Language) Interpreter accepts
commands from the APL terminals that access a data base. In
processing the DML commands, the DML Interpreter consults
the Data Base Definitions, inverted lists, and Access Tables
in order to interpret the commands and manipulate the
desired information. The DML Interpreter accesses the data
base by means of the APL+ file system access methods.

The execution of every DML command causes a system-wide
global variable STATUS to be set. This variable is
accessible to the user and is used by him to control the
execution of his program. If the DML command succeeds,
STATUS is set to 0; if it fails, STATUS is set to 1.

The occurrence of an error causes execution of a DML
command to be aborted. No changes to the data base or
system variables except STATUS and MESSAGE (described below)
occur. Syntax errors are communicated directly to the user
via his terminal. All other errors, e.g., end of data base,
that cause a DML command to fail are not communicated
directly to the user, but cause a system-wide variable
MESSAGE to be set. This variable contains a string message
indicating the nature of the error.
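
The error convention can be sketched in Python as follows. The names
SYSTEM, DMLError and run_command are hypothetical, and performing the
command on a copy of the data base is only one possible way of
ensuring that a failed command changes nothing; the text does not say
how EDBS achieves this.

```python
import copy

# Stand-in for the system-wide variables STATUS and MESSAGE.
SYSTEM = {"STATUS": 0, "MESSAGE": ""}

class DMLError(Exception):
    """Hypothetical exception standing in for any non-syntax DML failure."""

def run_command(data_base, command):
    """Run command on a copy; commit only on success so a failure changes nothing."""
    working = copy.deepcopy(data_base)
    try:
        command(working)
    except DMLError as err:
        SYSTEM["STATUS"] = 1            # command failed
        SYSTEM["MESSAGE"] = str(err)    # nature of the error
        return data_base                # original data base, untouched
    SYSTEM["STATUS"] = 0                # command succeeded
    return working
```

A user program would test STATUS after each call and consult MESSAGE
only when STATUS is 1.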

3.1. Retrieval commands

Retrieval commands provided are those that locate and
access data and those that locate and access and place the
data base in a hold status for modification. No commands
that require a user to have explicit knowledge of pointers
into the data base are provided. For reasons of security
and integrity, as well as to simplify data manipulation, it
is felt that a user should not have explicit knowledge or
control of pointers into the data base.

The interface between EDBS and APL, as far as
transferring data between the two is concerned, is handled
through an APL+ file (<output file>) specified by the user
for use by EDBS. For hierarchical data bases, only one
segment is retrieved for every retrieval command issued.
However, since the retrieved data is appended to the <output
file>, <output file> may contain more than one segment. For
relational data bases, a value, set of values, or set of
tuples of a relation is retrieved.

Data retrieved may be qualified by specifying a
<qualification expression>. Conditional operators allowed
are <, ≤, >, ≥, ≠, and =. Only ≠ and = are allowed for
string data. Logical connectors provided are AND and OR.
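
A qualification of this kind could be evaluated as in the Python
sketch below. The ASCII spellings of the comparison operators, the
representation of an expression as OR-connected groups of
AND-connected conditions, and the function names are all assumptions
of this sketch; the actual syntax is defined in [9].

```python
import operator

OPERATORS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
             ">=": operator.ge, "=": operator.eq, "!=": operator.ne}
STRING_OPERATORS = {"=", "!="}

def holds(record, field, op, value):
    """Evaluate one condition <field> <op> <value> against a record."""
    if isinstance(value, str) and op not in STRING_OPERATORS:
        raise ValueError("only = and != are allowed for string data")
    return OPERATORS[op](record[field], value)

def qualifies(record, expression):
    """expression is a list of AND-groups connected by OR: [[cond, cond], [cond]]."""
    return any(all(holds(record, *cond) for cond in group)
               for group in expression)
```

For instance, [[("AGE", ">", 30), ("NAME", "=", "ADAMS")]] selects
records whose AGE exceeds 30 and whose NAME is ADAMS.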

A relation in itself is useful for storing and
retrieving data. However, the real power in a relational
data base results from being able to extract related data
simultaneously from two or more relations. Such extraction
of data from two relations simultaneously is referred to as
a join. Various types of joins may be possible on two
relations, but in EDBS we restrict a join to be a natural
join [3].

As Parker and Jervis [11] point out, obtaining a
meaningful join of two relations is not always a simple
procedure or even possible. The difficulties arise when the
two relations to be joined have no domain in common. To
simplify relational retrievals in EDBS we make the
restriction that a join involve at most two relations and
that these relations have a common domain. Joins in EDBS
are implicit.
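
The restricted join can be illustrated by the following Python
sketch, which joins exactly two relations on the domains they have in
common and rejects relations with no common domain. Representing a
tuple as a mapping from domain name to value is an assumption made
for the example; it is not the storage structure of EDBS.

```python
def natural_join(r, s):
    """Natural join of two relations given as lists of dicts (domain name -> value)."""
    if not r or not s:
        return []
    common = sorted(set(r[0]) & set(s[0]))
    if not common:
        raise ValueError("relations have no domain in common")
    # Index the second relation on its values for the common domains.
    index = {}
    for t in s:
        index.setdefault(tuple(t[d] for d in common), []).append(t)
    # Pair each tuple of r with the tuples of s that agree on the common domains.
    return [{**a, **b}
            for a in r
            for b in index.get(tuple(a[d] for d in common), [])]
```

Tuples of either relation with no partner in the other do not appear
in the result.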

The retrieval commands provided will now be described.
In the following commands, optional fields are enclosed in
{}. The complete syntax of these commands can be found in
[9].

3.1.1. Hierarchical retrieval commands

(i) GU (GET UNIQUE) '<output file>;<segment name>;
<qualification expression>'

This command retrieves a specific occurrence of the root
segment <segment name> with qualification <qualification
expression> from the data base and places it in file <output
file>. Only key <field names> are allowed in the
<qualification expression>, and keys must be unique. A
position pointer is established in the data base which
points to the qualifying segment. Every user has a position
pointer, that is maintained and used by the system, into the
data base. This pointer is stored in the user's control
block.

(ii) GN (GET NEXT) '<output file>{;<segment name>
{;<qualification expression>}}'

This retrieves the hierarchically 'next' segment, the
'next' segment <segment name>, or the 'next' segment
<segment name> with qualification <qualification
expression>. 'Next' is determined by the current value of
the data base position pointer. Hierarchic 'next' is
defined by traversing the hierarchic structure from top to
bottom, left to right [8].

(iii) GNP (GET NEXT WITHIN PARENT) '<output file>
{;<segment name>{;<qualification expression>}}'

This retrieves the hierarchically 'next' segment within
a specific parent - set by the last GET UNIQUE or GET NEXT
call. Either the 'next' segment, 'next' segment <segment
name>, or 'next' segment <segment name> with qualification
<qualification expression> is retrieved.

(iv) GET HOLD

The GET HOLD command has the forms GHU (GET HOLD
UNIQUE), GHN (GET HOLD NEXT), or GHNP (GET HOLD NEXT WITHIN
PARENT) with the same arguments as the normal retrieval
commands. The segment is placed in a hold status ready for
modification. The Data File for the data base is locked.
No nested GET HOLD commands are allowed.

(v) RELEASE

Anytime after a GET HOLD command and before the
modification command, a RELEASE command may be issued which
unlocks the data base and cancels the planned modification.
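
The behaviour of GU, GN and GNP with respect to the position pointer
can be sketched in Python as follows. The classes Segment and Cursor
are hypothetical stand-ins for the stored segments and for the
position pointer kept in the user's control block; qualification
expressions, the <output file>, and the hold forms are omitted for
brevity.

```python
class Segment:
    def __init__(self, name, key, children=()):
        self.name = name
        self.key = key
        self.children = list(children)

def preorder(segment):
    """Yield segments top to bottom, left to right."""
    yield segment
    for child in segment.children:
        yield from preorder(child)

class Cursor:
    """Hypothetical stand-in for the position pointer in a user's control block."""

    def __init__(self, roots):
        self.roots = list(roots)
        self.sequence = [s for root in self.roots for s in preorder(root)]
        self.position = -1   # index of the current segment in hierarchic order
        self.parent = -1     # index of the segment set by the last GU or GN

    def _find(self, start, stop, name):
        for i in range(start, stop):
            if name is None or self.sequence[i].name == name:
                return i
        return None

    def get_unique(self, name, key):
        """GU: position on the root segment <name> with the given unique key."""
        for i, s in enumerate(self.sequence):
            if s in self.roots and s.name == name and s.key == key:
                self.position = self.parent = i
                return s
        return None

    def get_next(self, name=None):
        """GN: the hierarchically next segment, optionally of a given name."""
        i = self._find(self.position + 1, len(self.sequence), name)
        if i is None:
            return None
        self.position = self.parent = i
        return self.sequence[i]

    def get_next_within_parent(self, name=None):
        """GNP: like GN, but never leaves the subtree of the current parent."""
        if self.parent < 0:
            return None
        size = sum(1 for _ in preorder(self.sequence[self.parent]))
        i = self._find(self.position + 1, self.parent + size, name)
        if i is None:
            return None
        self.position = i
        return self.sequence[i]
```

A typical loop issues one GU for an EMPLOYEE and then repeats GNP
until it fails, thereby visiting every dependent of that employee.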

3.1.2. Relational retrieval commands

(i) GET '<output file>{;<limit>};<target list>
{;<qualification expression>}'

A value, set of values, or set of tuples of a relation -
<target list> - is retrieved from the data base and placed
in <output file>. The retrieval may be qualified so that
only certain values are retrieved, and so that only a
certain number <limit> are retrieved.

(ii) HOLD '<output file>{;<limit>};<target list>
{;<qualification expression>}'

This performs the same function as GET except that the
system is signalled that modification is to occur and
retains enough information to do the modification. The data
requested is placed in <output file> in the order retrieved.
The system retains this order, along with a matching list of
candidate key domain values (for integrity purposes), and an
index into the Data File from where the data was retrieved.
The user is not allowed to change the order of the retrieved
data or else an error will result. No nested HOLD commands
are allowed.

In order to simplify update, no joins are allowed for a
HOLD command. A join in itself is a complicated and time
consuming operation in EDBS. Combining it with a
modification command would degrade response time to the
point where such on-line modifications would have to be
ruled out.

(iii) RELEASE

Anytime after a data base is placed in HOLD status, and
before a modification command, a RELEASE command may be
issued that cancels the planned update and unlocks the
locked data base.
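
The interplay of GET, HOLD and RELEASE can be sketched in Python as
below. The class RelationalSession and its methods are hypothetical;
a qualification is passed as a Python function rather than parsed
from a <qualification expression>, and the retained information is
reduced to the list of retrieved row positions.

```python
class RelationalSession:
    def __init__(self, relation):
        self.relation = relation   # list of dicts: domain name -> value
        self.held = None           # row indexes retained by HOLD, else None

    def _select(self, qualification, limit):
        rows = [i for i, t in enumerate(self.relation)
                if qualification is None or qualification(t)]
        return rows if limit is None else rows[:limit]

    def _project(self, rows, target_list):
        return [tuple(self.relation[i][d] for d in target_list) for i in rows]

    def get(self, target_list, qualification=None, limit=None):
        """GET: retrieve the target list, optionally qualified and limited."""
        return self._project(self._select(qualification, limit), target_list)

    def hold(self, target_list, qualification=None, limit=None):
        """HOLD: as GET, but remember which rows were retrieved, in order."""
        if self.held is not None:
            raise RuntimeError("no nested HOLD commands are allowed")
        self.held = self._select(qualification, limit)
        return self._project(self.held, target_list)

    def release(self):
        """RELEASE: cancel the planned update."""
        self.held = None
```

A second HOLD is accepted only after the first has been released or,
in the full system, completed by a modification command.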

3.2. Modification commands

Add (insert only), Change, and Delete are the only
modification commands provided. A Reorder command is not
provided since it is felt that such a function can be
performed in the host language once the data has been
retrieved. No capability to alter data definitions online
is included in the system at present. The use of such a
capability should be restricted to DBA-mode users only.

In modifying data in the data base, the user is required
to follow a few simple rules such as not changing the order
of the data to be updated and not changing the key
fields/sequence fields/candidate key domains when using
certain modification commands.

Dynamic allocation of secondary storage is only
necessary in EDBS when inserting a new segment/relation
instance. The process consists of merely appending a new
segment/relation to the Data File, indicating in the Access
Table that a new segment/relation has been inserted (only a
certain maximum number is allowed) and updating the
appropriate inverted lists. The append operation in APL+ is
a simple command (FAPPEND). The updating of the Access
Table and the inverted lists is the major problem in
insertion.
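
The three steps of an insertion can be sketched in Python as
follows. The sketch simplifies heavily: the Data File is a list, the
Position Matrix is one-dimensional with 0 meaning 'empty', and a
dictionary stands in for the threaded-tree inverted list. All names
are assumptions of this illustration.

```python
def insert_segment(data_file, position_matrix, inverted_list, slot, segment, key):
    """Insert one segment occurrence; slot is a 1-origin index into the matrix."""
    if not 1 <= slot <= len(position_matrix):
        raise ValueError("maximum number of occurrences exceeded")
    if position_matrix[slot - 1] != 0:
        raise ValueError("slot is already occupied")
    data_file.append(segment)                       # step 1: append to the Data File
    position_matrix[slot - 1] = len(data_file)      # step 2: record it in the Access Table
    inverted_list.setdefault(key, []).append(slot)  # step 3: update the inverted list
    return len(data_file)
```

The checks are made before anything is changed, so a rejected
insertion leaves all three structures as they were.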

Concurrent modification and maintaining the consistency
of the data base is a major problem that has no simple
solution. In EDBS, the problem of consistency only arises
when a modification requires the modification of related
inverted lists and the Access Tables. The solution that has
been adopted is to lock the Data File. This file contains
all of the data that may need to be changed as the result of
a modification command.

Backup in EDBS is handled quite simply by keeping an
extra copy of the original data base. Since the data bases
will be relatively small, it is not a prohibitive expense to
maintain an extra copy and to restore this copy as the
actual data base when necessary. This is especially
desirable from the standpoint of starting with a 'clean'
data base say after a class has finished an assignment.
Changes to the data base can be logged and time-stamped
automatically by APL+ so that it is possible to reconstruct
the data base up to the time of failure if desired.

4. Conclusions

EDBS was designed to be used for teaching purposes
although it could be used in a commercial environment if the
data bases are small. The user does not have to be
concerned with storage structure or file accessing. He
views his data based on a logical data organization and
accesses it in the same way. The format of the data base
commands is designed to fit 'nicely' into the APL language.

EDBS is readily accessible from any remote terminal
capable of accessing the APL system. Sign-on procedures
follow those of APL. EDBS is easily invoked simply by
loading a copy of the system into the user's workspace.
Allowing every user to have his own copy of the system
isolates a user and prevents him from directly affecting
[...]

Flexibility is achieved by providing two logical data
organizations - hierarchical and relational - and by
providing a set of commands for each logical organization
that allows the user to manipulate the data in the data
base. The two organizations can be quite independent of
each other or the same data base can be accessed using one
or the other of the two sets of commands. Data is defined
by an easy to understand and use data definition language.

The system has been kept inexpensive by keeping it small
and simple. No great optimizing strategies are employed for
storage allocation or retrieval. Wherever possible, we have
used facilities already available in APL such as the file
system and the security and integrity features.

A prototype system is currently being implemented.
Target date for implementation of the system is September
1974. At this time, we hope to be able to use the system in
a data management course at the University of Toronto.